ABSTRACT

An item response theory-based parametric procedure proposed
by N. S. Raju, W. J. van der Linden, and P. F. Fleer (1995), known as
differential functioning of items and tests (DFIT), can be used with
unidimensional and multidimensional data with dichotomous or polytomous
scoring. This study describes the polytomous DFIT framework and evaluates and
compares its performance to that of the extension of the SIBTEST procedure
developed by R. Shealy and W. Stout (1993) and the extended Lord's chi-square.
Using simulated data, the effects of sample size (500 and 1,000 examinees),
focal group distribution (N(0,1) and N(-1,1)), number of differential item
functioning (DIF) items (0%, 10%, and 20%), magnitude of DIF, and the value
of the a-parameter were evaluated. Overall, the DFIT framework performed well.
Type I error rates were affected by the number of DIF items, magnitude of
DIF, and the value of the a-parameters. The DIF detection rates were affected
by all the factors in the study. Future directions for research are
discussed. (Contains 5 tables and 16 references.) (Author/SLD)

THE RELATIONSHIP BETWEEN POLYTOMOUS DFIT AND
OTHER POLYTOMOUS DIF PROCEDURES

The increased use of polytomously-scored items on tests has stimulated interest in
polytomous differential item functioning (DIF) procedures. Many polytomous DIF procedures
have been proposed over the last four years, including combined t tests (Welch & Hoover, 1993);
an extension of Shealy and Stout’s (1993) SIBTEST procedure (Chang, Mazzeo, & Roussos, 1996;
Mazzeo & Chang, 1994); an application of logistic discriminant function analysis (Miller &
Spray, 1993); logistic regression approaches (Rogers & Swaminathan, 1994); and extensions of
Lord’s chi-square, signed area, and unsigned area (Cohen, Kim, & Baker, 1993).

An IRT-based parametric procedure proposed by Raju, van der Linden, and Fleer (1995),
known as differential functioning of items and tests (DFIT), can be used with unidimensional and
multidimensional data with dichotomous and/or polytomous scoring. The purpose of this study is
to describe the polytomous DFIT framework and compare its performance to the extension of
Shealy and Stout’s (1993) SIBTEST (Chang, Mazzeo, & Roussos, 1996; Mazzeo & Chang,
1994) and the extended Lord’s chi-square (Cohen, Kim, & Baker, 1993). Both
DFIT and SIBTEST have differential test functioning (DTF) procedures, but they are not
examined in this study. The first section of this study provides a definition of DIF. The second
section describes the polytomous DFIT framework of Raju et al. (1995) and briefly explains
the extensions of SIBTEST and Lord’s chi-square. The third section presents the
results of a simulation. The final section summarizes the findings and suggests directions for
future research.



Definition of DIF 

Chang and Mazzeo (1994) demonstrated that for the graded response model (Samejima,
1969), partial credit model (Masters, 1982), and generalized partial credit model (Muraki, 1992),
if two items have the same item response function (IRF), then they must have the same number
of scoring categories and the same item category response functions (ICRFs). An IRF for a
polytomous item can be expressed as

$$E_g[Y \mid \theta] = \sum_{k} k\, P_{gk}(\theta) \qquad (1)$$

where $E_g[Y \mid \theta]$ is the expected item score ($Y$) for group $g$ at a given $\theta$ level, and
$P_{gk}(\theta)$ is the ICRF for group $g$ with a category score of $k$. In other words, the expected
item score is a weighted sum of the ICRFs. The null DIF hypothesis would be

$$E_R[Y \mid \theta] = E_F[Y \mid \theta] \qquad (2)$$

where $E_R[Y \mid \theta]$ is the expected item score for an examinee in the reference group with a given $\theta$
and $E_F[Y \mid \theta]$ is the expected item score for an examinee in the focal group with a given $\theta$.
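
Equations 1 and 2 can be illustrated numerically. The following is a minimal Python sketch, not the authors' code. It assumes the logistic form of the graded response model without the 1.7 scaling constant and category scores of 0 through m - 1 (the paper states neither choice), and the function names are hypothetical. It evaluates the ICRFs and the expected item score for the reference-group parameters of item 10 in Table 1 and for a focal group whose b-parameters are shifted by .25.

```python
import math

def grm_category_probs(theta, a, bs):
    """ICRFs for one graded response model item at ability theta.

    bs holds the ordered boundary (b) parameters; an item with m
    categories has m - 1 boundaries. Logistic form without the 1.7
    scaling constant (an assumption, not taken from the paper).
    """
    # Cumulative probabilities of scoring in category k or higher.
    cum = [1.0] + [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in bs] + [0.0]
    # Each category probability is the difference of adjacent cumulatives.
    return [cum[k] - cum[k + 1] for k in range(len(bs) + 1)]

def expected_item_score(theta, a, bs):
    """Equation 1: sum of k * P_k(theta), categories scored 0..m-1 (assumed)."""
    return sum(k * p for k, p in enumerate(grm_category_probs(theta, a, bs)))

# Item 10 in Table 1 (reference group) and a focal group with DIF of .25.
a_item = 1.00
bs_ref = [-1.80, -0.60, 0.60, 1.80]
bs_focal = [b + 0.25 for b in bs_ref]

es_ref = expected_item_score(0.0, a_item, bs_ref)
es_focal = expected_item_score(0.0, a_item, bs_focal)
# The null hypothesis in Equation 2 fails when these two values differ.
print(es_ref, es_focal)
```

Because the shifted boundaries make every category harder to reach, the focal group's expected score at the same ability level is lower than the reference group's, which is the kind of difference Equation 2 rules out under the null hypothesis.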

Polytomous DIF Procedures 

Potenza and Dorans (1994) proposed a framework for classifying polytomous DIF
procedures. First, they distinguished between procedures that use an observed score as a
matching variable and procedures that match groups in terms of an estimate of a latent variable.
Second, they distinguished between parametric approaches that assume a parametric functional
form for the item response function (IRF) and procedures that do not make such assumptions
(i.e., nonparametric approaches).

Using this classification system, all the DIF procedures in this study use an estimate of
the latent variable as a matching variable. The only difference between the procedures is that
DFIT and Lord’s chi-square are parametric approaches (i.e., they require IRT ability and item
parameter estimates) and the extension of SIBTEST is a nonparametric approach.

The Polytomous DFIT Framework

Raju et al. (1995) suggest that for polytomously-scored data an expected score ($ES_{si}$) for
item $i$ can be computed for examinee $s$ as

$$ES_{si} = \sum_{k=1}^{m} P_{ik}(\theta_s)\, X_{ik} \qquad (3)$$

where $X_{ik}$ is the score or weight for category $k$; $m$ is the number of categories; and $P_{ik}$ is the
probability of responding to category $k$ (similar to Equation 1). Summing the expected item
scores across a test will result in the true test score function for each examinee as

$$T_s = \sum_{i=1}^{n} ES_{si} \qquad (4)$$

where $n$ is the number of items in the test. The null hypothesis for DTF would be

$$T_R = T_F \qquad (5)$$

where $T_R$ and $T_F$ are the true test scores for the reference and focal group examinees with the
same $\theta$, respectively.

The difference between the dichotomous and polytomous DFIT frameworks is the
calculation of the item true score. The item true score must accommodate the multiple categories
in the polytomous model (see Equation 3). Once the true item and test scores are known, the
DFIT framework for the polytomous case is identical to the DFIT framework for the
dichotomous case.

According to Raju et al. (1995), a measure of DTF at the examinee level may be defined
as

$$D_s = T_{sF} - T_{sR}. \qquad (6)$$

DTF across the focal group examinees may be defined as

$$DTF = \mathcal{E}_F\left[D_s^2\right] = \mathcal{E}_F\left[(T_{sF} - T_{sR})^2\right] \qquad (7)$$

or, equivalently,

$$DTF = \int_{\theta} D_s^2\, f_F(\theta)\, d\theta \qquad (8)$$

where $f_F(\theta)$ is the density function of $\theta$ for the focal group. Also,

$$DTF = \sigma_D^2 + (\mu_{T_F} - \mu_{T_R})^2 = \sigma_D^2 + \mu_D^2 \qquad (9)$$

where $\mu_{T_F}$ is the mean true score for the focal group examinees; $\mu_{T_R}$ is the mean true score for the
same examinees as if they were members of the reference group; and $\sigma_D^2$ is the variance of $D$.
Differential functioning at the item level can be derived from Equation 7. If

$$d_{si} = ES_{siF} - ES_{siR} \qquad (10)$$

then

$$DTF = \mathcal{E}_F\left[\left(\sum_{i=1}^{n} d_{si}\right)^2\right] \qquad (11)$$

where $n$ is the number of items in a test and the expectation is taken over the focal group
examinees. This can be rewritten as

$$DTF = \sum_{i=1}^{n} \left[\mathrm{Cov}(d_i, D) + \mu_{d_i}\mu_D\right] \qquad (12)$$

where $\mathrm{Cov}(d_i, D)$ is the covariance of the difference in expected item scores ($d_i$) and the difference
in true scores ($D$), and $\mu_{d_i}$ and $\mu_D$ are the means of $d_{si}$ and $D_s$, respectively. In this case DIF can be
written as

$$DIF_i = \mathrm{Cov}(d_i, D) + \mu_{d_i}\mu_D. \qquad (13)$$

Raju et al. (1995) refer to this DIF as compensatory DIF (CDIF). If DIF in Equation 13 is
expressed as CDIF, then Equation 12 can be rewritten as

$$DTF = \sum_{i=1}^{n} CDIF_i. \qquad (14)$$

The additive nature of DTF allows for possible cancellation at the test level. This occurs 
when one item displays DIF in favor of one group and another item displays DIF in favor of the 
other group. This combination of DIF items will have a canceling effect on the overall DTF. The 
sum of the CDIF indices reflects the net directionality. 

Raju et al. (1995) proposed a second index, named noncompensatory DIF (NCDIF), that
assumes that all items other than the one under study are free from differential functioning. In the
dichotomous case, NCDIF is closely related to other existing DIF indices such as Lord’s
chi-square and the unsigned area (Raju et al., 1995). If all other items are DIF free, then $d_j = 0$ for all
$j \neq i$, where $i$ is the item being studied, and Equation 13 can be rewritten as

$$NCDIF_i = \sigma_{d_i}^2 + \mu_{d_i}^2. \qquad (15)$$

Raju et al. (1995) noted that items having significant NCDIF do not necessarily have significant
CDIF in the sense of contributing significantly to DTF. For example, if one item favors the
reference group and another item favors the focal group, significant NCDIF occurs for both
items even though the two CDIF indices may not be significant because of their canceling effect
at the test level. This could lead to a greater number of significant NCDIF items than CDIF
items.

Statistical significance testing can be performed, but these tests have been shown to be
overly sensitive for large sample sizes (Fleer, 1993). Fleer suggested empirically establishing a
critical (cutoff) value for NCDIF. This critical value was determined from a Monte Carlo study
of non-DIF items.
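
The relationship among DTF, CDIF, and NCDIF in Equations 7 and 12 through 15 can be checked with a small numerical sketch. The Python code below is an illustration only, not the Fortran program used in this study. The function name and the toy expected scores are hypothetical, and means, variances, and covariances are computed as population moments (dividing by the number of focal examinees), which is an assumption made so that the CDIF values sum exactly to DTF.

```python
def mean(values):
    return sum(values) / len(values)

def dfit_indices(es_focal, es_ref):
    """Return (DTF, list of CDIF_i, list of NCDIF_i).

    es_focal[s][i] and es_ref[s][i] are the expected scores of focal
    examinee s on item i, computed with the focal-group and the
    reference-group item parameters, respectively.
    """
    # d[s][i] follows Equation 10; big_d[s] is the true score difference D_s.
    d = [[f - r for f, r in zip(row_f, row_r)]
         for row_f, row_r in zip(es_focal, es_ref)]
    big_d = [sum(row) for row in d]
    mu_big_d = mean(big_d)
    dtf = mean([x * x for x in big_d])                      # Equation 7
    cdif, ncdif = [], []
    for i in range(len(d[0])):
        d_i = [row[i] for row in d]
        mu_i = mean(d_i)
        cov = mean([(x - mu_i) * (y - mu_big_d) for x, y in zip(d_i, big_d)])
        var_i = mean([(x - mu_i) ** 2 for x in d_i])
        cdif.append(cov + mu_i * mu_big_d)                  # Equation 13
        ncdif.append(var_i + mu_i ** 2)                     # Equation 15
    return dtf, cdif, ncdif

# Toy data (made up): item 1 favors the focal group by .2 and item 2
# favors the reference group by .2 for every focal examinee.
es_ref = [[2.0, 2.0], [1.5, 1.5], [2.5, 2.5]]
es_focal = [[2.2, 1.8], [1.7, 1.3], [2.7, 2.3]]
dtf, cdif, ncdif = dfit_indices(es_focal, es_ref)
print(dtf, cdif, ncdif)
```

In this toy case the two items cancel at the test level, so DTF and both CDIF values are essentially zero, while each item still has an NCDIF of about .04. This is the situation described above in which NCDIF flags items that CDIF does not.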

Extension of SIBTEST 

Chang, Mazzeo, and Roussos (1996) provide a detailed explanation of the extension of
SIBTEST. The amount of DIF at a given $\theta$ is measured by

$$B_0(\theta) = E_R[Y \mid \theta] - E_F[Y \mid \theta]. \qquad (16)$$

Shealy and Stout (1993) provide a global index of DIF as

$$\beta = \int B_0(\theta)\, f_F(\theta)\, d\theta. \qquad (17)$$

This is interpreted as the expected amount of DIF experienced by a randomly selected focal
group examinee.

Two minor modifications to the original SIBTEST are needed to accommodate
polytomous data: (1) replacing $n$ (i.e., the number of items) in the SIBTEST test statistic
(Shealy & Stout, 1993) with $n_h$ (the maximum test score) and (2) modifying the matching test
reliability estimates used by Shealy and Stout in their regression correction by substituting
Cronbach’s alpha for KR-20 (Chang, Mazzeo, & Roussos, 1996).
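
To make the interpretation of the global index in Equation 17 concrete, the sketch below evaluates the integral numerically for a single item. It is a model-based illustration only and is not the SIBTEST estimator, which works from observed matching-test scores with a regression correction. The logistic graded response model without the 1.7 constant, category scores of 0 through m - 1, the trapezoidal rule, and all function names are assumptions introduced here.

```python
import math

def expected_score(theta, a, bs):
    """Expected item score under a logistic graded response model (assumed).

    With categories scored 0..m-1, the expected score equals the sum of
    the cumulative boundary probabilities.
    """
    return sum(1.0 / (1.0 + math.exp(-a * (theta - b))) for b in bs)

def normal_pdf(theta, mean, sd):
    z = (theta - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def beta_index(a, bs_ref, bs_focal, focal_mean=0.0, focal_sd=1.0, points=2001):
    """Trapezoidal approximation of Equation 17 over +/- 6 SD of the focal density."""
    lo = focal_mean - 6.0 * focal_sd
    hi = focal_mean + 6.0 * focal_sd
    step = (hi - lo) / (points - 1)
    total = 0.0
    for j in range(points):
        theta = lo + j * step
        # Equation 16: reference minus focal expected score at this theta.
        diff = expected_score(theta, a, bs_ref) - expected_score(theta, a, bs_focal)
        weight = 0.5 if j in (0, points - 1) else 1.0
        total += weight * diff * normal_pdf(theta, focal_mean, focal_sd)
    return total * step

bs_ref = [-1.80, -0.60, 0.60, 1.80]            # item 10 in Table 1, a = 1.00
bs_focal = [b + 0.25 for b in bs_ref]          # DIF constant of .25
print(beta_index(1.00, bs_ref, bs_focal, focal_mean=-1.0))
```

A positive value indicates that, averaged over the focal group's ability distribution, the item favors the reference group; with identical parameters in both groups the index is zero.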

Lord’s Chi-Square 

Lord’s chi-square simultaneously tests the differences between the a- and b-parameters of
the two groups. In the dichotomous case, a vector of differences between the parameters is
calculated. A similar method is used in the polytomous case, except that the vector contains a
larger number of elements because of the multiple b-parameters in the
polytomous model. Cohen, Kim, and Baker (1993) offer the following polytomous extension of
Lord’s chi-square:

$$v_j' = \left(\hat{a}_{jF} - \hat{a}_{jR},\; \hat{b}_{j1F} - \hat{b}_{j1R},\; \ldots,\; \hat{b}_{j(m-1)F} - \hat{b}_{j(m-1)R}\right) \qquad (18)$$

where $j$ is the item under study and $m$ is the number of categories. Then

$$\chi_j^2 = v_j'\, \Sigma_j^{-1}\, v_j \qquad (19)$$

where $\Sigma_j$ is the variance-covariance matrix of the differences between item parameter estimates.
There are $m$ degrees of freedom for this extension of Lord’s chi-square.
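
The quadratic form in Equation 19 can be sketched in a few lines. The Python code below is a hypothetical illustration, not the Fortran program used in this study. It assumes the parameter estimates are already on a common metric and that the variance-covariance matrix of the differences is supplied by the caller. A small Gaussian elimination routine is used in place of a matrix library so the example has no dependencies.

```python
def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def lords_chi_square(params_focal, params_ref, cov_diff):
    """Return (chi-square, degrees of freedom) for one item.

    params_* are (a, b_1, ..., b_{m-1}) estimates for each group;
    cov_diff is the variance-covariance matrix of their differences.
    """
    v = [f - r for f, r in zip(params_focal, params_ref)]   # Equation 18
    w = solve_linear(cov_diff, v)                            # cov_diff^{-1} v
    return sum(vi * wi for vi, wi in zip(v, w)), len(v)     # Equation 19

# Made-up two-parameter example: differences (1, 1), correlated errors.
chi2, df = lords_chi_square([2.0, 1.0], [1.0, 0.0], [[2.0, 1.0], [1.0, 2.0]])
print(chi2, df)
```

For a five-category item the vectors would hold one a-parameter and four b-parameters, giving the five degrees of freedom implied by the text.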

Simulation Study 

To evaluate the performance of the DFIT framework, a simulation study was conducted.
NCDIF was the only DFIT index evaluated in this study. The results from
DFIT were compared to the extensions of SIBTEST and Lord’s chi-square. The simulation
represented a 20-item test with all items having five response categories. Item response data were
generated using the graded response model. The reference group item parameters are contained
in Table 1.



Insert Table 1 about here



Factors Examined

DIF and null DIF modeling. DIF was modeled by adding a constant to the b-parameters
of the focal group (i.e., $b_{ikF} = b_{ikR} + C_{ik}$, where $i$ is the item and $k$ is the item boundary). For the
null condition, $C_{ik}$ was equal to zero. For the DIF conditions, two magnitudes of DIF were
embedded: (1) $C_{ik} = .10$ and (2) $C_{ik} = .25$. The number of DIF items was also varied across test
conditions: (1) two DIF items and (2) four DIF items. For the two-DIF-item conditions, items 4
and 17 were embedded with DIF. For the four-DIF-item conditions, items 1, 4, 10, and 17 were
embedded with DIF. It should be noted that the b-parameters are the same across these items,
but the a-parameters vary (i.e., item 1 = .55, item 4 = .75, item
10 = 1.00, and item 17 = 1.36).

Other factors. Two sample sizes were simulated. In one condition, the focal and reference
groups each had 500 examinees, and in the other condition, the focal and reference groups each
had 1000 examinees. Additionally, two focal group ability distributions were simulated: N(0,1)
and N(-1,1). The reference group ability distribution was N(0,1) for all simulation conditions.
Simulation under each factor combination was iterated 100 times. The nominal alpha used for
detecting DIF was 0.05.
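
The data-generation step described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the generator used in the study: it assumes the logistic graded response model without the 1.7 constant and category scores of 0 through 4, and the function names and seeds are hypothetical.

```python
import math
import random

def simulate_grm_score(theta, a, bs, u):
    """Draw one graded response score (0..m-1) from a uniform deviate u.

    The score is the number of ordered boundaries whose cumulative
    probability P(score >= k) exceeds u.
    """
    return sum(1 for b in bs if u < 1.0 / (1.0 + math.exp(-a * (theta - b))))

def simulate_group(n, ability_mean, a, bs, seed):
    """Simulate n examinees with abilities drawn from N(ability_mean, 1)."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n):
        theta = rng.gauss(ability_mean, 1.0)
        scores.append(simulate_grm_score(theta, a, bs, rng.random()))
    return scores

a_item = 1.00                                  # item 10 in Table 1
bs_ref = [-1.80, -0.60, 0.60, 1.80]
dif_constant = 0.25                            # C_ik for the larger DIF condition
bs_focal = [b + dif_constant for b in bs_ref]  # b_ikF = b_ikR + C_ik

ref_scores = simulate_group(500, 0.0, a_item, bs_ref, seed=1)
focal_scores = simulate_group(500, -1.0, a_item, bs_focal, seed=2)
print(sum(ref_scores) / 500, sum(focal_scores) / 500)
```

In a full replication this would be repeated for all 20 items and for each of the 100 iterations per condition, with the constant applied only to the designated DIF items.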

Calculation of DIF Indices 

Both DFIT and Lord’s chi-square calculations required item and ability estimation as
well as an equating procedure. All item and ability parameters were estimated using the
computer program PARSCALE 2 (Muraki & Bock, 1993). The marginal maximum likelihood
procedure and EM algorithm were used to estimate the item parameters. Default values were
used for all estimations. Underlying abilities were estimated using the Bayesian expected a
posteriori (EAP) procedure, which incorporates normal priors. The equating coefficients were
estimated by means of Baker's modified test characteristic curve method as implemented in the EQUATE
2.0 computer program (Baker, 1993). In this study, all parameter estimates for the reference
group were equated to the underlying metric of the focal group. A Fortran program written by
Raju (1995) was used to calculate the DFIT indices. A Fortran program written by Kim (1993)
was used to calculate Lord’s chi-square.

Recall that the DFIT statistical tests are overly sensitive to large sample sizes. Critical (cutoff)
values were established independently of the current study by simulating 2000 non-DIF items and
noting the value at the 95th percentile. The critical values used in this study were .011 for the
500-examinee condition and .05 for the 1000-examinee condition. This would be equivalent to a
nominal alpha of .05.
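
The empirical cutoff procedure amounts to taking a percentile of NCDIF values obtained from items known to be free of DIF. The Python sketch below shows the idea with made-up numbers. The exponential draws are only a stand-in for simulated null NCDIF values, the linear-interpolation percentile rule is an assumption, and the function names are hypothetical.

```python
import random

def empirical_cutoff(null_values, percentile=95.0):
    """Percentile of the null NCDIF values, with linear interpolation."""
    ordered = sorted(null_values)
    rank = percentile * (len(ordered) - 1) / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (rank - lower) * (ordered[upper] - ordered[lower])

def flag_rate(values, cutoff):
    """Proportion of items whose NCDIF exceeds the cutoff."""
    return sum(1 for v in values if v > cutoff) / len(values)

rng = random.Random(0)
# Stand-in for NCDIF values from 2000 simulated non-DIF items (made up).
null_ncdif = [rng.expovariate(300.0) for _ in range(2000)]
cutoff = empirical_cutoff(null_ncdif)
print(cutoff, flag_rate(null_ncdif, cutoff))
```

By construction about 5% of the null items exceed the cutoff, which is why a cutoff set this way corresponds to a nominal alpha of .05 for items of the kind used to build it.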

A computer program, PSIBTEST, written by Roussos, Shealy, and Chang (1993) was 
used to detect DIF for the extension of SIBTEST. This program does not require the estimation 
of item parameters or equating. 



Results 

The results are divided into two sections: Type I error rate and DIF detection rate.
Five effects are discussed in each section: (1) number of examinees (500 and 1000); (2) focal
group distribution (N(0,1) and N(-1,1)); (3) number of DIF items (0, 2, and 4); (4) magnitude of
DIF (.10 and .25); and (5) item discrimination (.55, .75, 1.00, and 1.36).

Comparisons of effectiveness among the DIF indices should not be made. As
mentioned previously, critical values for detecting DIF in DFIT were established empirically.
The performance of SIBTEST and Lord’s chi-square would improve if this method were
used to establish critical values for these DIF procedures. This study focuses on the effects of
the manipulated factors on the performance of the DIF indices.

Type I Error Rate 

Table 2 contains the average Type I error rate for all conditions. 



Insert Table 2 about here 



The effects of sample size can be examined by looking across conditions in Table 2.
Sample size had little effect on the NCDIF error rate. Most conditions were close to the alpha
level (i.e., .05). The only exception was the condition with the greatest number of DIF items
(four items) and the greatest magnitude of DIF (.25). In this condition the Type I error rate
increased (ranging from .09 to .14), with the 1000-examinee conditions having the greatest error
rate. Sample size had little effect on SIBTEST when the reference and focal groups had equal
ability distributions (Focal: N(0,1)), but the error rate almost doubled in the 1000-examinee
condition with a focal group distribution of N(-1,1). Lord’s chi-square had a higher error rate than
expected in all conditions, with the 1000-examinee condition usually having the highest error rate.

NCDIF was not affected by the focal group distribution. The Type I error rates were
almost identical across the two focal group distributions in all conditions. Focal group
distribution had the most noticeable effect on SIBTEST. In all conditions the Type I error rate
increased in the Focal: N(-1,1) condition. This is
consistent with previous studies, which showed that there is an over-regression-correction
(Chang, Mazzeo, & Roussos, 1996) when focal and reference group distributions are not
equivalent. Lord’s chi-square had a slight increase in the error rate for the Focal: N(-1,1) condition in
almost all conditions.

All indices showed an increased error rate as the number of DIF items and the magnitude
of the DIF increased. As expected, the condition with the most DIF items and the greatest
magnitude had the largest Type I error rate.

Table 3 contains the Type I error rate for the studied items (i.e., items 1, 4, 10, and 17) in
the null condition. The value of the a-parameters had an effect on the Type I error rate. For
NCDIF, smaller a-parameters resulted in higher Type I error rates. No trend was noted with
SIBTEST or Lord’s chi-square.



Insert Table 3 about here 



DIF Detection Rate 

Table 4 contains the average DIF detection rate for all conditions. 



Insert Table 4 about here 



For all the DIF indices, in almost all conditions, the detection rate was consistently higher
in the 1000-examinee conditions than in the 500-examinee conditions. NCDIF and SIBTEST had
consistently lower detection rates in the Focal: N(-1,1) condition than in the Focal: N(0,1)
condition. This pattern was not
noted for Lord’s chi-square. For all indices, the average detection rate decreased as the number of
DIF items increased. As expected, all the indices had a much higher detection rate for the .25
condition than for the .10 condition.

Table 5 contains the detection rate by studied item. When the magnitude of DIF was
.25, highly discriminating items had a better detection rate for NCDIF. This trend was not noted in
the .10 magnitude condition. Both SIBTEST and Lord’s chi-square had higher detection rates as
the item discrimination increased in all conditions.



Insert Table 5 about here 



Summary 

This study supports the validity of the DFIT framework in detecting DIF in polytomous
data. The Type I error rates were close to the nominal alpha level except when the number of DIF
items (i.e., 20% of the test) and the magnitude of DIF were highest. This is typically true for most
DIF procedures. The only other factor that affected the Type I error rate was the value of the
a-parameters; that is, items with lower a-parameters had higher Type I error rates. The DIF
detection rate was affected by all the factors in this study. DFIT detects DIF items better with
(1) larger sample sizes, (2) equivalent focal and reference group distributions, (3) fewer DIF
items in a test, (4) greater magnitude of DIF, and (5) larger a-parameter values.

The results of this study encourage further research on the DFIT framework. First, a
statistical test that is not as sensitive to sample size needs to be developed. Second, the DTF
procedure (i.e., CDIF) needs to be investigated. Third, the DFIT framework needs to be applied
to a mixed test format (i.e., dichotomous and polytomous items). Fourth, the ability to detect
different types of DIF (i.e., uniform and nonuniform) needs to be examined.

Table 1

Reference Group Item Parameters Used in Simulation Study

Item
Number      a_i      b_i1     b_i2     b_i3     b_i4

   1        0.55    -1.80    -0.60     0.60     1.80
   2        0.73    -2.32    -1.12     0.08     1.28
   3        0.73    -1.80    -0.60     0.60     1.80
   4        0.73    -1.80    -0.60     0.60     1.80
   5        0.73    -1.28    -0.08     1.12     2.32
   6        1.00    -2.78    -1.58    -0.38     0.82
   7        1.00    -2.32    -1.12     0.08     1.28
   8        1.00    -2.32    -1.12     0.08     1.28
   9        1.00    -1.80    -0.60     0.60     1.80
  10        1.00    -1.80    -0.60     0.60     1.80
  11        1.00    -1.80    -0.60     0.60     1.80
  12        1.00    -1.80    -0.60     0.60     1.80
  13        1.00    -1.28    -0.08     1.12     2.32
  14        1.00    -1.28    -0.08     1.12     2.32
  15        1.00    -0.82     0.38     1.58     2.78
  16        1.36    -2.32    -1.12     0.08     1.28
  17        1.36    -1.80    -0.60     0.60     1.80
  18        1.36    -1.80    -0.60     0.60     1.80
  19        1.36    -1.28    -0.08     1.12     2.32
  20        1.80    -1.80    -0.60     0.60     1.80

Table 2

Type I Error Rate (α = .05) for All Conditions

                            NCDIF          SIBTEST        Lord’s χ²
                          Number of       Number of       Number of
                          Examinees       Examinees       Examinees
Condition                 500    1000     500    1000     500    1000

Null
    Focal: N(0,1)         .06     .04     .05     .05     .13     .08
    Focal: N(-1,1)        .05     .04     .08     .15     .12     .12

Constant .10
  2 DIF Items
    Focal: N(0,1)         .05     .06     .05     .06     .12     .13
    Focal: N(-1,1)        .05     .06     .09     .19     .13     .15
  4 DIF Items
    Focal: N(0,1)         .06     .06     .06     .06     .13     .15
    Focal: N(-1,1)        .06     .07     .12     .22     .16     .17

Constant .25
  2 DIF Items
    Focal: N(0,1)         .07     .07     .07     .08     .14     .17
    Focal: N(-1,1)        .07     .06     .12     .25     .17     .17
  4 DIF Items
    Focal: N(0,1)         .09     .14     .12     .15     .22     .30
    Focal: N(-1,1)        .10     .12     .17     .37     .24     .31

Table 3

Type I Error Rate (α = .05) for Studied Item in the Null Condition

                                         NCDIF          SIBTEST        Lord’s χ²
                                       Number of       Number of       Number of
                                       Examinees       Examinees       Examinees
Condition            a-parameter       500    1000     500    1000     500    1000

Null
  Focal: N(0,1)
    Item 1               .55           .15     .16     .05     .03     .10     .07
    Item 4               .73           .09     .08     .06     .06     .09     .09
    Item 10             1.00           .08     .02     .02     .03     .15     .07
    Item 17             1.36           .04     .00     .04     .05     .15     .09
  Focal: N(-1,1)
    Item 1               .55           .12     .11     .05     .10     .07     .05
    Item 4               .73           .14     .10     .08     .11     .16     .17
    Item 10             1.00           .02     .02     .08     .15     .11     .13
    Item 17             1.36           .04     .03     .12     .12     .12     .14

Table 4

Average DIF Detection Rate for Each Condition (α = .05)

                            NCDIF          SIBTEST        Lord’s χ²
                          Number of       Number of       Number of
                          Examinees       Examinees       Examinees
Condition                 500    1000     500    1000     500    1000

Constant .10
  2 DIF Items
    Focal: N(0,1)         .30     .44     .30     .38     .43     .63
    Focal: N(-1,1)        .22     .42     .15     .15     .40     .62
  4 DIF Items
    Focal: N(0,1)         .25     .40     .21     .28     .34     .52
    Focal: N(-1,1)        .20     .30     .12     .10     .34     .47

Constant .25
  2 DIF Items
    Focal: N(0,1)         .91     .98     .85     .94     .91     .98
    Focal: N(-1,1)        .87     .98     .72     .66     .93     .98
  4 DIF Items
    Focal: N(0,1)         .78     .94     .72     .80     .79     .92
    Focal: N(-1,1)        .73     .94     .54     .48     .79     .94

Table 5

Detection Rate by DIF Item (α = .05)

                                     NCDIF          SIBTEST        Lord’s χ²
                                   Number of       Number of       Number of
                                   Examinees       Examinees       Examinees
Condition                          500    1000     500    1000     500    1000

Constant .10
  2 DIF Items
    Focal: N(0,1)     Item 4       .35     .45     .19     .22     .32     .46
                      Item 17      .24     .42     .40     .53     .54     .79
    Focal: N(-1,1)    Item 4       .25     .38     .10     .13     .29     .43
                      Item 17      .18     .46     .19     .16     .50     .80
  4 DIF Items
    Focal: N(0,1)     Item 1       .24     .41     .10     .16     .14     .34
                      Item 4       .29     .42     .14     .13     .30     .44
                      Item 10      .24     .41     .24     .34     .39     .54
                      Item 17      .22     .34     .35     .47     .51     .74
    Focal: N(-1,1)    Item 1       .16     .27     .06     .08     .16     .21
                      Item 4       .26     .26     .09     .10     .30     .32
                      Item 10      .20     .39     .16     .09     .41     .58
                      Item 17      .16     .29     .15     .12     .50     .76

Constant .25
  2 DIF Items
    Focal: N(0,1)     Item 4       .82     .96     .71     .87     .81     .95
                      Item 17     1.00    1.00     .98    1.00    1.00    1.00
    Focal: N(-1,1)    Item 4       .81     .96     .52     .44     .87     .95
                      Item 17      .93    1.00     .92     .88     .98    1.00
  4 DIF Items
    Focal: N(0,1)     Item 1       .57     .82     .36     .48     .48     .71
                      Item 4       .72     .95     .64     .74     .72     .95
                      Item 10      .92    1.00     .91     .97     .96    1.00
                      Item 17      .89    1.00     .96    1.00     .98    1.00
    Focal: N(-1,1)    Item 1       .60     .85     .29     .24     .55     .81
                      Item 4       .65     .92     .34     .40     .67     .94
                      Item 10      .84     .99     .71     .60     .93     .99
                      Item 17      .83     .99     .81     .68     .98    1.00